-
Notifications
You must be signed in to change notification settings - Fork 811
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix out of bounds read in bit chunk iterator #416
Fix out of bounds read in bit chunk iterator #416
Conversation
let next = | ||
unsafe { std::ptr::read_unaligned(raw_data.add(index + 1)).to_le() }; | ||
let next = unsafe { | ||
std::ptr::read_unaligned(raw_data.add(index + 1) as *const u8) as u64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix here is casting the pointer back to *u8 and reading only a single byte.
The other changes are a bit of a cleanup, the masking of next
below should not be needed since it masked of exactly the bits that would be shifted out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is not the remainder, don't we potentially need to read more than 8 bits? I.e. doesn't this index contain between 1 and 63 bits that need to be "merged" into current
?
I get a feeling that this will ignore all bits after the 8th and less than 64. At least this is what I remember from fixing it in arrow2 here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constructor ensures that the bit_offset is between 0..8, this means we need to be able to read unaligned u64, but need at most one additional byte. I'll add this as a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right 👍 Good thinking. Thanks for the clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jhorstmann
I reviewed the code and ran test_iter_remainder_out_of_bounds
under MIRI on my machine:
cargo +nightly miri test -p arrow -- test_iter_remainder_out_of_bounds
With the code change in this PR:
No failures are reported
Without the code change in this PR:
running 1 test
test util::bit_chunk_iterator::tests::test_iter_remainder_out_of_bounds ... error: Undefined Behavior: memory access failed: pointer must be in-bounds at offset 4103, but is outside bounds of alloc1293907 which has size 4096
--> /Users/alamb/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:788:9
|
788 | copy_nonoverlapping(src as *const u8, tmp.as_mut_ptr() as *mut u8, mem::size_of::<T>());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ memory access failed: pointer must be in-bounds at offset 4103, but is outside bounds of alloc1293907 which has size 4096
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside `std::ptr::read_unaligned::<u64>` at /Users/alamb/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:788:9
note: inside `<util::bit_chunk_iterator::BitChunkIterator as std::iter::Iterator>::next` at arrow/src/util/bit_chunk_iterator.rs:144:26
--> arrow/src/util/bit_chunk_iterator.rs:144:26
|
144 | unsafe { std::ptr::read_unaligned(raw_data.add(index + 1)).to_le() };
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= note: inside `<util::bit_chunk_iterator::BitChunkIterator as std::iter::Iterator>::fold::<std::option::Option<u64>, fn(std::option::Option<u64>, u64) -> std::option::Option<u64> {std::iter::Iterator::last::some::<u64>}>` at /Users/alamb/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/iter/traits/iterator.rs:2111:29
= note: inside `<util::bit_chunk_iterator::BitChunkIterator as std::iter::Iterator>::last` at /Users/alamb/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/iter/traits/iterator.rs:242:9
note: inside `util::bit_chunk_iterator::tests::test_iter_remainder_out_of_bounds` at arrow/src/util/bit_chunk_iterator.rs:268:30
--> arrow/src/util/bit_chunk_iterator.rs:268:30
|
268 | assert_eq!(u64::MAX, bitchunks.iter().last().unwrap());
| ^^^^^^^^^^^^^^^^^^^^^^^
note: inside closure at arrow/src/util/bit_chunk_iterator.rs:259:5
--> arrow/src/util/bit_chunk_iterator.rs:259:5
|
259 | / fn test_iter_remainder_out_of_bounds() {
260 | | // allocating a full page should trigger a fault when reading out of bounds
261 | | const ALLOC_SIZE: usize = 4 * 1024;
262 | | let input = vec![0xFF_u8; ALLOC_SIZE];
... |
269 | | assert_eq!(0x7F, bitchunks.remainder_bits());
270 | | }
| |_____^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! 💯
* Fix out of bounds read in bit chunk iterator * Add comment why reading one additional byte is enough
Which issue does this PR close?
Closes #198.
Rationale for this change
The previous code could read a few bytes out of bounds. I could not cause this to trigger a segmentation fault, but miri also detected a problem and can now run the tests without errors.
What changes are included in this PR?
Are there any user-facing changes?
No